A Robust Boosting Method for Mislabeled Data

Authors

  • Natsuki Sano
  • Hideo Suzuki
  • Masato Koda
Abstract

We propose a new, robust boosting method that uses a sigmoidal function as its loss function. The method is derived by combining the stagewise additive modelling methodology with gradient descent algorithms. Based on extensive numerical experiments, we show that the proposed method achieves lower test error rates than AdaBoost and other regularized methods when the training data are noisy and mislabeled.
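The abstract does not give the exact loss, weak learner, or step-size rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the loss 1 / (1 + exp(y F(x))), decision stumps as weak learners, and a fixed step size; all function names are hypothetical. Each round reweights the examples by the negative derivative of the loss with respect to the margin, then adds the stump that best agrees with the weighted labels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_loss_weights(margins):
    """Negative derivative of the assumed loss 1 / (1 + exp(m)) w.r.t. the margin m."""
    s = sigmoid(margins)
    return s * (1.0 - s)

def stump_output(stump, X):
    """Evaluate a stump (feature index, threshold, sign) as +/-1 predictions."""
    j, t, sign = stump
    return sign * np.where(X[:, j] <= t, -1.0, 1.0)

def fit_stump(X, y, w):
    """Pick the stump h that maximises sum_i w_i * y_i * h(x_i)."""
    best, best_score = None, -np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            base = np.sum(w * y * np.where(X[:, j] <= t, -1.0, 1.0))
            for sign in (1.0, -1.0):
                if sign * base > best_score:
                    best_score, best = sign * base, (j, t, sign)
    return best

def fit_boost(X, y, n_rounds=20, step=0.5):
    """Stagewise additive fit: F <- F + step * h, one stump per round (fixed step is an assumption)."""
    F = np.zeros(len(y))
    ensemble = []
    for _ in range(n_rounds):
        w = sigmoid_loss_weights(y * F)   # gradient-based example weights
        stump = fit_stump(X, y, w)
        F += step * stump_output(stump, X)
        ensemble.append((step, stump))
    return ensemble

def predict(ensemble, X):
    """Sign of the additive model; ties at zero are assigned to the +1 class."""
    F = np.zeros(X.shape[0])
    for step, stump in ensemble:
        F += step * stump_output(stump, X)
    return np.where(F >= 0, 1.0, -1.0)
```

Under this assumed loss the example weight s(1 - s) shrinks toward zero for large negative margins, so a point that the current model misclassifies badly (a likely mislabeled point) loses influence instead of gaining it, unlike the exponentially growing weights in AdaBoost.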

Similar resources

Boosted Noise Filters for Identifying Mislabeled Data

In many practical classification problems, mislabeled data instances (i.e., class noise) exist in the acquired (training) data and often have a detrimental effect on the classification performance. Identifying such noisy instances and removing them from training data can significantly improve the trained classifiers. One such effective noise detector is the so-called ensemble filter, which pred...

On Boosting and Noisy Labels

Boosting is a machine learning technique widely used across many disciplines. Boosting enables one to learn from labeled data in order to predict the labels of unlabeled data. A central property of boosting instrumental to its popularity is its resistance to overfitting. Previous experiments provide a margin-based explanation for this resistance to overfitting. In this thesis, the main finding ...

Kernel Based Detection of Mislabeled Training Examples

The problem of identifying mislabeled training examples has been examined in several studies, with a variety of approaches developed for editing the training data to obtain better classifiers. Many of these approaches involve applying an individual or an ensemble of classifiers to the training set and filtering the mislabeled examples based on their consistency with respect to the classifier’s ...

Experiments on Ensembles with Missing and Noisy Data

One of the potential advantages of multiple classifier systems is an increased robustness to noise and other imperfections in data. Previous experiments on classification noise have shown that bagging is fairly robust but that boosting is quite sensitive. Decorate is a recently introduced ensemble method that constructs diverse committees using artificial data. It has been shown to generally ou...

Active cleaning of label noise

Mislabeled examples in the training data can severely affect the performance of supervised classifiers. In this paper, we present an approach to remove any mislabeled examples in the dataset by selecting suspicious examples as targets for inspection. We show that the large margin and soft margin principles used in support vector machines (SVM) have the characteristic of capturing the mislabeled...

Publication date: 2004